This lab will explore the spatial distribution of climate damages across the US and the relationship between climate damages and areas with a high percentage of disadvantaged residents.
To run this lab, open a new, empty R script file.
Read through a section of the lab document and use CTRL-C then CTRL-V to copy a gray section of the lab code into your R file.
Examine each block of code and try to understand what each statement does.
Sometimes it helps to first execute the code, view the results, then examine the code that produced the results.
Read the CEJST PowerPoint to remind yourself how the individual disadvantaged categories are calculated, and how the overall category is calculated. Scan the CEJST website at
https://screeningtool.geoplatform.gov
CEJST’s basic spatial unit is the 2010 census tract. Tracts are largish neighborhoods averaging about 4,000 people. They are relatively stable over time. They are nested within counties in the same way that counties are nested within states.
Tracts are identified by an 11-character geoid that consists of FIPS codes for state (2 characters), county (3 characters) and tract (6 characters). This means the first two characters of the code identify the tract’s state, and the first five characters of the code identify the tract’s county. This makes it very easy for R programmers to group_by() and summarize() tract-level data to states or counties.
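The geoid layout described above can be checked with a few lines of base R. This is a minimal sketch, not part of the lab code: the helper name split_geoid() is hypothetical, and the tract portions of the two example ids are made up for illustration (13 is Georgia's state code; 13121 and 13089 are Fulton and DeKalb counties).

```r
# Split 11-character tract geoids into their FIPS components (base R only).
# split_geoid() is a hypothetical helper written for this illustration.
split_geoid <- function(geoid) {
  geoid <- as.character(geoid)
  # Ids read as numbers lose leading zeros, so insist on 11 characters.
  stopifnot(all(nchar(geoid) == 11))
  data.frame(
    geoid  = geoid,
    state  = substr(geoid, 1, 2),   # first 2 characters: state
    county = substr(geoid, 1, 5),   # first 5 characters: state + county
    tract  = substr(geoid, 6, 11),  # last 6 characters: tract within county
    stringsAsFactors = FALSE
  )
}

# Example ids; the tract portions are illustrative, not looked up.
parts <- split_geoid(c("13121001001", "13089020100"))
print(parts)
```

Keeping the geoid as text matters: if a spreadsheet import turns it into a number, states such as Alabama (01) lose their leading zero and the character positions shift.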
Import the CEJST dataset with the code below and spend some time browsing the extensive list of variables. Remember: all this data is now available to you at the tract (neighborhood) level!
Copy the code below into your R file, but skip the three lines that begin with “knitr”. Execute the code and examine the output R dataframes.
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning = FALSE)
library(tidyverse)
library(readxl)
library(sf)
library(RColorBrewer)
library(tmap)
library(plotly)
setwd("c:/temp/class2322")
source("functions.R")
# Task 1: Lab setup for CEJST data
# read national census tract CEJST data
j1 = read_excel("communities-2022-05-31-1915GMT.xlsx")
# create county geocode from first 5 characters of tract id
# total population, and disadvantaged population
j2 = j1 %>%
  tolow() %>%
  mutate(geocode = paste0("g", mid(census.tract.id, 1, 5))) %>%
  select(geocode, census.tract.id, everything()) %>%
  mutate(totpop = total.population,
         dispop = ifelse(identified.as.disadvantaged == "TRUE",
                         total.population, 0))
# add population and disadvantaged population by county
# calculate disadv as true when over 50% of the population
# is disadvantaged
j3 = j2 %>%
  group_by(geocode) %>%
  summarize(totpop = sum(totpop, na.rm = TRUE),
            dispop = sum(dispop, na.rm = TRUE)) %>%
  mutate(dispct = 100 * dispop / totpop) %>%
  mutate(disadv = ifelse(dispct > 50, TRUE, FALSE))
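To see what the county rollup computes, it can help to redo the arithmetic on a tiny table you can check by hand. The sketch below uses base R's aggregate() rather than the lab's dplyr pipeline, and every value in it (geocodes, populations, flags) is made up for illustration.

```r
# Toy tract table: two hypothetical counties with invented populations.
toy <- data.frame(
  geocode = c("g13001", "g13001", "g13001", "g13003", "g13003"),
  totpop  = c(4000, 3000, 3000, 5000, 5000),
  disadvantaged = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# A tract contributes its whole population if it is flagged, otherwise 0.
toy$dispop <- ifelse(toy$disadvantaged, toy$totpop, 0)

# Sum both population columns within each county geocode.
county <- aggregate(cbind(totpop, dispop) ~ geocode, data = toy, FUN = sum)

# Percent of county population living in disadvantaged tracts.
county$dispct <- 100 * county$dispop / county$totpop

# A county counts as disadvantaged only when the share is strictly over 50.
county$disadv <- county$dispct > 50
print(county)
```

The first toy county has 7,000 of 10,000 residents in flagged tracts (70 percent, so disadv is TRUE). The second sits at exactly 50 percent, so the strict "over 50%" rule leaves it FALSE.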
The j1, j2, and j3 dataframes are the result of reading a spreadsheet of attribute data. There is no spatial data in those dataframes.
The CEJST developers have also released a giant(!) shapefile that contains the same attributes plus US 2010 census tract boundaries for all 74,000 US tracts. R seems to read the file effectively, but may hang if you try to draw all the US tracts using an R package like tmap.
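One way to avoid drawing every US tract is to subset to a single state before mapping, using the two-character state prefix of the tract id. The sketch below is an assumption about how such a subset could be made, not a record of how any file in this lab was produced. The helper filter_state() is hypothetical, the id column name geoid10 is assumed and may differ in your copy of the data, and the demo rows are made up.

```r
# Keep only the tracts of one state, identified by its 2-character FIPS prefix.
# `tracts` can be a plain dataframe or an sf spatial dataframe; `id_col` names
# the 11-character tract id column (assumed here to be "geoid10").
filter_state <- function(tracts, state_fips, id_col = "geoid10") {
  ids <- as.character(tracts[[id_col]])
  # Guard against missing ids so they are dropped rather than kept as NA rows.
  keep <- !is.na(ids) & startsWith(ids, state_fips)
  tracts[keep, ]
}

# Made-up demo rows: two Georgia ids (prefix 13) and one Alabama id (prefix 01).
demo <- data.frame(
  geoid10 = c("13121001001", "01001020100", "13089020100"),
  totpop  = c(4000, 3500, 4200),
  stringsAsFactors = FALSE
)
ga <- filter_state(demo, "13")
print(ga)
```

Filtering before mapping keeps the interactive map responsive, since only one state's polygons are sent to the viewer.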
The command below reads a spatial dataframe that has been filtered for Georgia tracts only. Unfortunately, the column names are brief and cryptic. Fortunately, the developers have also given us a variable documentation file named “columns_csv” that explains what each variable is. Open columns_csv in a spreadsheet and scan through it.
gacejst = readRDS("gacejst.rds") %>%
  # the two commands starting with "disadv =" are not a mistake.
  # the first sets every tract to 0, even those with missing values.
  # the second sets disadvantaged tracts to 1. the !is.na() guard is
  # needed because ifelse() returns NA when its test is NA, which
  # would otherwise overwrite the 0 for tracts with a missing sm_c.
  # the variable sm_c is a binary 1/0 variable representing whether
  # or not the tract is disadvantaged.
  mutate(disadv = 0,
         disadv = ifelse(!is.na(sm_c) & sm_c == 1, 1, disadv),
         geocode = paste0("g", geoid10)) %>%
  select(geocode, disadv, everything())
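One detail about ifelse() is worth checking by hand when a flag column may contain missing values: base R's ifelse() returns NA wherever its test is NA. The toy vector below is made up for illustration. Whether sm_c actually contains missing values in the Georgia file is not stated here, so treat this as a sketch of the behavior rather than a description of the data.

```r
# Toy flag vector: disadvantaged, not disadvantaged, and missing.
sm_c <- c(1, 0, NA)

# Unguarded test: NA == 1 is NA, so the third result is NA.
unguarded <- ifelse(sm_c == 1, 1, 0)

# Guarded test: !is.na() makes the test FALSE for missing values,
# so the third result falls through to 0.
guarded <- ifelse(!is.na(sm_c) & sm_c == 1, 1, 0)

print(unguarded)
print(guarded)
```

You can check your own data with sum(is.na(gacejst$sm_c)). If the count is zero, the two versions give identical results.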
The following tmap code draws the Georgia CEJST tracts in an interactive map.
Zoom to the Georgia Tech area and identify disadvantaged tracts near campus.
tmap_mode("view")
tm_basemap("OpenStreetMap") +
  tm_shape(gacejst) +
  tm_polygons("disadv", alpha = 0.25)